AI Coding Agent Security: A Defense-in-Depth Reference Architecture

A practical, layered security framework for organizations deploying AI coding agents in development workflows.

Who This Is For

Security teams establishing guardrails for AI-assisted development
Engineering leads evaluating and deploying AI coding tools safely
Developers using Claude Code, GitHub Copilot CLI, Cursor, Windsurf, or similar agents
CISOs and compliance officers building governance around agentic AI

The Threat in Numbers

The attack surface is real, measured, and growing.

41--83% prompt injection success rates across major AI coding platforms

30+ CVEs disclosed across Claude Code, Copilot, and Cursor in 2025--2026

36% of AI agent skills contain security flaws (Snyk ToxicSkills, Feb 2026)

72%+ tool poisoning attack success rate on MCP servers

20% of AI-generated code references hallucinated (non-existent) packages

First documented AI-orchestrated espionage campaign -- GTG-1002, Sep 2025, targeting ~30 entities with 80--90% automation

These are not theoretical risks. Every statistic above comes from published research or disclosed incidents within the last twelve months.

Defense-in-Depth Layer Model

Security for AI coding agents cannot rely on a single control. The following model nests eleven layers from infrastructure (outermost, strongest isolation) to governance (innermost, organizational controls). A failure at any single layer is contained by the layers surrounding it.

graph LR
    subgraph L1["Layer 1: Infrastructure Isolation"]
        subgraph L2["Layer 2: OS Sandboxing"]
            subgraph L3["Layer 3: Network Controls"]
                subgraph L4["Layer 4: Filesystem & Permissions"]
                    subgraph L5["Layer 5: Secrets Management"]
                        subgraph L6["Layer 6: Human-in-the-Loop"]
                            subgraph L7["Layer 7: Prompt Injection Defense"]
                                subgraph L8["Layer 8: Supply Chain Security"]
                                    subgraph L9["Layer 9: MCP Security"]
                                        subgraph L10["Layer 10: Audit & Observability"]
                                            L11["Layer 11: Governance & Frameworks"]
                                        end
                                    end
                                end
                            end
                        end
                    end
                end
            end
        end
    end

    style L1 fill:#1a1a2e,color:#fff
    style L2 fill:#16213e,color:#fff
    style L3 fill:#0f3460,color:#fff
    style L4 fill:#533483,color:#fff
    style L5 fill:#e94560,color:#fff
    style L6 fill:#f39c12,color:#000
    style L7 fill:#2ecc71,color:#000
    style L8 fill:#3498db,color:#fff
    style L9 fill:#9b59b6,color:#fff
    style L10 fill:#1abc9c,color:#000
    style L11 fill:#ecf0f1,color:#000

Security Controls Summary Matrix

Each layer links to a detailed document covering threats, controls, implementation guidance, and tool-specific notes.

Layer	Threats Mitigated	Key Tools / Techniques	Difficulty	Impact
1. Infrastructure Isolation	Full system compromise, container escape	Codespaces, Firecracker, E2B, Docker	High	Highest
2. OS Sandboxing	Unauthorized syscalls, file access beyond scope	Seatbelt, bubblewrap, seccomp, Landlock	Medium	Very High
3. Network Controls	Data exfiltration, C2 communication, SSRF	Egress filters, agent proxies, Pipelock	Medium	Very High
4. Filesystem & Permissions	Unauthorized file access, overprivileged operations	`.aiignore`, allowlists, least privilege	Low	High
5. Secrets Management	Credential theft, env variable leakage	Env scrubbing, credential brokers, short-lived tokens	Medium	High
6. Human-in-the-Loop	Destructive actions, unintended modifications	Approval workflows, permission modes	Low	High
7. Prompt Injection Defense	Direct/indirect injection, instruction hijacking	Input validation, output filtering, confirmation gates	High	High
8. Supply Chain Security	Hallucinated dependencies, malicious packages	Lockfiles, SBOM, dependency scanning, code review	Medium	High
9. MCP Security	Tool poisoning, confused deputy, token theft	Mutual auth, scoped OAuth, server-side authz	High	High
10. Audit & Observability	Undetected breaches, compliance violations	Activity logging, behavioral analysis	Medium	Medium
11. Governance & Frameworks	Organizational gaps, regulatory non-compliance	OWASP, OpenSSF, SAIF, NIST	Low	Medium
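
To make the "Env scrubbing" technique in the Layer 5 row concrete, here is a minimal sketch of building a reduced environment before launching an agent process. The allowlist, the deny patterns, and the function name `scrub_env` are illustrative assumptions, not part of any specific tool.

```python
import fnmatch

# Hypothetical allowlist of variables the agent genuinely needs.
SAFE_VARS = {"PATH", "HOME", "LANG", "TERM"}
# Hypothetical patterns for names that look like credentials.
SECRET_PATTERNS = ["*TOKEN*", "*SECRET*", "*PASSWORD*", "*API_KEY*", "AWS_*"]


def scrub_env(env, safe_vars=SAFE_VARS, secret_patterns=SECRET_PATTERNS):
    """Return a new dict holding only allowlisted, non-secret-looking variables."""
    clean = {}
    for name, value in env.items():
        if name not in safe_vars:
            continue  # default-deny: anything not allowlisted is dropped
        if any(fnmatch.fnmatchcase(name.upper(), p) for p in secret_patterns):
            continue  # drop secret-looking names even if someone allowlisted them
        clean[name] = value
    return clean
```

One way to use it would be to pass `env=scrub_env(os.environ)` to `subprocess.run` when starting the agent, so the child process never inherits the full parent environment. Default-deny is the deliberate design choice here: a pattern-based blocklist alone tends to miss unconventionally named secrets.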

Quick Start

Not sure where to begin? Here are the top 10 actions you can take today to reduce your exposure:

Run AI agents inside disposable containers or Codespaces -- never on your host machine
Enable the built-in sandbox (Claude Code's sandbox uses Seatbelt on macOS)
Block outbound network access except to an explicit allowlist
Add a .aiignore file to every repository
Remove long-lived credentials from agent-accessible environments
Set your agent to ask-before-executing mode for destructive operations
Pin all dependencies and verify lockfiles after AI-generated changes
Audit every MCP server connection -- remove any you did not explicitly install
Enable activity logging and route agent actions to your SIEM
Adopt an AI-acceptable-use policy and train developers on injection risks
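
As an illustration of the explicit-allowlist action in the list above, the following sketch shows the decision logic such a filter might apply to each outbound URL. The host names and the function `is_egress_allowed` are assumptions chosen for the example.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist; replace with the hosts your agent actually needs.
ALLOWED_HOSTS = {"pypi.org", "files.pythonhosted.org", "api.example.com"}


def is_egress_allowed(url, allowed_hosts=ALLOWED_HOSTS):
    """Allow only https URLs whose host exactly matches an allowlisted name."""
    parts = urlsplit(url)
    if parts.scheme != "https":
        return False  # refuse plaintext and non-web schemes outright
    # hostname strips userinfo and port, so "pypi.org@evil.example" is not fooled
    host = (parts.hostname or "").lower().rstrip(".")
    return host in allowed_hosts  # exact match: no suffix or substring tricks
```

Exact host matching is used on purpose: substring or suffix checks are easy to bypass with names like `pypi.org.evil.example`. A check like this is only a sketch of the policy; in practice it should be enforced by a proxy or firewall outside the agent's control, since an agent can bypass in-process checks.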

See checklists/quickstart.md for the full annotated checklist with copy-paste commands.
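
The dependency-pinning action can also be spot-checked automatically. The sketch below assumes a pip-style `requirements.txt` and flags any line that is not pinned to an exact version; the function name `unpinned_requirements` is hypothetical, and the regular expression is a simplification of the full requirement syntax.

```python
import re

# Matches "name==version" with optional extras, e.g. "rich[jupyter]==13.7.1".
# Wildcard versions such as "==4.*" are deliberately not treated as pinned.
PINNED = re.compile(
    r"^[A-Za-z0-9][A-Za-z0-9._-]*(\[[^\]]+\])?==[^\s;*]+(?=\s|;|$)"
)


def unpinned_requirements(text):
    """Return requirement lines that are not pinned to an exact version."""
    problems = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith("-"):
            continue  # skip blanks and pip options such as -r or --hash
        if not PINNED.match(line):
            problems.append(line)
    return problems
```

Running this over the requirements file after an AI-generated change gives a quick list of lines to review. Pinning alone does not show that a package exists or is trustworthy, so a hallucinated name would still need to be checked against the registry and reviewed by a human.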

Tool-Specific Hardening Guides

Each guide covers the agent's permission model, sandbox configuration, known CVEs, and recommended settings.

Claude Code -- sandbox modes, permission scopes, .claude/settings.json hardening
GitHub Copilot -- Copilot CLI and agent mode, policy controls, enterprise settings
Cursor -- YOLO mode risks, rule files, network and filesystem restrictions
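
As a rough illustration of what `.claude/settings.json` hardening could look like in review tooling, the sketch below audits a settings document for overly broad permissions. It assumes the file has a `permissions` object with `allow` and `deny` lists of rule strings such as `Bash(npm run test:*)` or `Read(./.env)`; that shape, the baseline rules, and the function `audit_settings` are assumptions to verify against the current Claude Code documentation.

```python
import json

# Assumed rule strings that grant unrestricted shell access.
BROAD_ALLOW = {"Bash", "Bash(*)"}
# Hypothetical baseline of deny rules every repository should carry.
EXPECTED_DENY = {"Read(./.env)"}


def audit_settings(settings_json):
    """Return human-readable findings for a settings JSON string."""
    settings = json.loads(settings_json)
    perms = settings.get("permissions", {})
    allow = set(perms.get("allow", []))
    deny = set(perms.get("deny", []))
    findings = []
    for rule in sorted(allow & BROAD_ALLOW):
        findings.append(f"overly broad allow rule: {rule}")
    for rule in sorted(EXPECTED_DENY - deny):
        findings.append(f"missing deny rule: {rule}")
    return findings
```

A check of this kind could run in CI so that a pull request which loosens agent permissions is flagged before merge. The baseline sets are intentionally small; each team would extend them to match its own policy.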

Incident Timeline

A condensed record of significant public incidents involving AI coding agents and their ecosystems.

Date	Incident	Impact
May 2025	GitHub MCP vulnerability -- malicious commands in Issues	Private source code and key exfiltration
Sep 2025	GTG-1002 AI-orchestrated espionage (disclosed by Anthropic)	~30 entities targeted, 80--90% automated
Oct 2025	Postmark-MCP supply chain attack	~300 organizations compromised
Oct 2025	CVE-2025-59536: Claude Code RCE via project files	Remote code execution
Nov 2025	GitHub Copilot CVEs (path traversal, output validation)	Unauthorized file access
Feb 2026	Agents of Chaos study -- 11 agent failure modes	Academic proof of systemic risks
Feb 2026	Snyk ToxicSkills -- 36% of AI skills have security flaws	76 confirmed malicious payloads

Contributing

Contributions are welcome. This is a living document -- the threat landscape evolves weekly, and community input keeps it current.

Open an issue to report inaccuracies, suggest new layers, or flag emerging threats
Submit a pull request to improve existing guides or add tool-specific hardening notes
See individual layer docs for areas marked as needing expansion

License

This work is licensed under Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0).

You are free to share and adapt this material for any purpose, including commercial use, provided you give appropriate credit and distribute any adaptations under the same license.
